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Abstract —A novel algorithm and implementation of real-time identification and tracking of blob-filaments in fusion reactor data is 
presented. Similar spatio-temporal features are important in many other applications, for example, ignition kernels in combustion and 
tumor cells in a medical image. This work presents an approach for extracting these features by dividing the overall task into three 
steps: local identification of feature cells, grouping feature cells into extended feature, and tracking movement of feature through 
overlapping in space. Through our extensive work in parallelization, we demonstrate that this approach can effectively make use of a 
large number of compute nodes to detect and track blob-filaments in real time in fusion plasma. On a set of 30GB fusion simulation 
data, we observed linear speedup on 1024 processes and completed blob detection in less than three milliseconds using Edison, a 
Cray XC30 system at NERSC. 
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1 Introduction 

A Wide variety of "big data" such as simulations of 
diesel combustion and images of tissue from biopsy, 
are spatiotemporal in nature 0 0 When analyzing these 
data sets, a common task is to find coherent structure in 
space and time, for example, ignition kernels in combustion, 
and cancerous cells in medical images. There are many 
possible approaches to identify such a feature based on the 
application requirements S' B 0. @ However, when 
faced with tight time constraints many of these techniques 
are too slow to produce an satisfactory answer. 

Our work was originally motivated by the need to detect 
spatio-temporal feature associated with instabilities in fu¬ 
sion plasma. Magnetic confinement fusion has the potential 
to be an inexhaustible source of clean energy; and billions of 
dollars have been invested in developing fusion reactors, 
like the ITER project (7). However, steady-state plasma 
confinement is often interrupted by blob filaments driven by 
the edge turbulence. A blob filament (or blob) is a magnetic- 
field-aligned plasma structure that appears near the edge 
of the confined plasma, and has significantly higher density 
and temperature than the surrounding plasma |8[. Blobs can 
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also be considered as outliers because they are rare events 
that convect filaments of plasma outwards towards the 
containment wall, causing substantial heat loss, degradation 
of the plasma confinement, and erosion of the containment 
wall. By identifying and tracking these blob filaments from 
fusion plasma data streams, physicists can improve their 
understanding of the dynamics and interactions of such 
coherent structures (blobs) with edge turbulence. 

Fusion experiments and simulations could easily pro¬ 
duce many terabytes per second; and features such as blobs 
have to be detected in milliseconds in order for the control 
system to have a chance to take mitigating actions. Though 
there are many well known feature extraction methods for 
detecting outliers, they often have some shortcomings. Clas¬ 
sical multi-dimensional outlier detection techniques are de¬ 
signed to detect global outliers. However, these techniques 
do not distinguish between non-spatial attributes and spa¬ 
tial attributes and do not consider apriori information about 
the statistical distribution of the data |9|. Since spatio- 
temporal data types have unique characteristics and their 
relations are more complicated than ordinary data, dedi¬ 
cated outlier detection techniques are typically required to 
examine anomalies in data across space and time j5j. In this 
work, we propose an approach for detecting and tracking 
spatio-temporal features such as blobs by breaking down 
the process into three steps: (1) find cells that satisfying 
application specific requirements, (2) group cells into spatial 
features, and (3) track features by the amount of overlap 
in space. By varying the first step, this procedure could be 
applying to different applications. Earlier, this approach was 
applied to data from regular meshes (5), (10). In this work, 
we will demonstrated that it can also be effectively applied 
to irregular mesh data. 

This work addresses several challenges exemplified by 
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Fig. 1: A real-time data analysis frame for finding blob- 
filaments in fusion plasma data streams 

the detection of blobs in fusion plasma. First off, fusion 
experiments and numerical simulations can easily generate 
massive amounts of data per run. During a magnetic fusion 
device experiment (or "shot"), terabytes of data are gen¬ 
erated over short time periods (on the order of hundreds 
of seconds). In the XGC1 fusion simulation (TT) , (12) , a 
few tens of terabytes can be generated per second. Timely 
access to this amount of data can already be a challenge (13), 
1141, but analyzing all this data in real time is impractical. 
Currently, there are three types of analyses in most of 
fusion experiments: in-shot-analysis, between-shot-analysis, 
and post-run-analysis. All existing blob detection methods 
address post-run-analysis, but in this work, we focus on 
the more challenging first two cases to provide a real-time 
analysis so that scientists can monitor the progress of fusion 
experiments. Figure [l] presents a real-time analysis frame 
for finding blob-filaments in fusion plasma data streams. To 
perform this data analysis in real time, we utilize effectively 
modern supercomputers to address the high volume and 
velocity challenges arising from fusion plasma big data. 

This work has been integrated into the International 
Collaboration Framework for Extreme Scale Experiments 
(ICEE), a wide-area in-transit data analysis framework for 
near real-time scientific applications Jl5’|. ICEE takes advan¬ 
tage of an efficient IO solution ADIOS (16), and a cutting- 
edge indexing solution FastBit jlO), to design and construct 
a real-time remote data processing framework over wide- 
area networks for international collaborations such as the 
ITER project. In this system, a blob detection algorithm 
is used to monitor the health of fusion experiments at 
the Korea Superconducting Tokamak Advanced Research 
(KSTAR). However, existing data analysis approaches are 
often single-threaded, only for post-run analysis, and take 
a long time to produce results. Also, compared to the 
simulation data, the resolution of the raw camera data may 
be coarse, but interesting features can still be identified after 
normalization. In order to meet real-time feedback require¬ 
ment, we develop a real-time blob detection method, which 
can leverage in-situ raw data in the ICEE server and find 
blob-filaments efficiently during fusion experiments. Our 
blob detection algorithm is not limited to KSTAR only, and 
can be applied to other fusion experiments and simulations. 

In this research, we apply the three-step approach to 
detect and track blob structures in fusion data, with the 
goal of achieving millisecond response time on terabytes 
of data. With this response space, it is possible for the 
control system of the magnetic confinement fusion reactor to 
implement mitigating strategies in real time. To the best of 
our knowledge, this is the first time a blob detection method 


could satisfy the millisecond time requirement. Additional 
contributions of this work include: 

• We illustrate how to adopt the three-step approach 
to detect and track blob-filaments as an example of 
spatial-temporal feature on an irregular mesh. 

• We propose a two-phase region outlier detection 
method for finding blob-filaments. In the first phase, 
we apply a distribution-based outlier detection 
scheme to identify blob candidate points. In the 
second phase, we adopt a fast two-pass connected 
component labeling (CCL) algorithm from 1171 to find 
different region outliers on an irregular mesh. 

• We develop a high-performance blob detection ap¬ 
proach to meet real-time feedback requirements by 
exploiting many-core architectures in a large cluster. 

• We propose a scheme to efficiently track the move¬ 
ment of region outliers by linking the centers of the 
region outlier over consecutive frames. 

• We have implemented our blob detection algorithm 
with hybrid MPI/OpenMP, and demonstrated the 
effectiveness and efficiency of our implementation 
with a set of data from the XGC1 fusion simula¬ 
tions. Our tests show that we can achieve linear time 
speedup and complete blob detection in two or three 
milliseconds using a cluster at NERSC. In addition, 
we demonstrate that our method is more robust 
than recently developed state-of-the-art blob detec¬ 
tion methods in 1181, 1191. 

The rest of paper is organized as follows. In Section 
II, we give the problem formulation of the blob detection 
and discuss related work. In Section III we describe in 
detail our three-step approach consisting of a two-phase 
region outlier detection algorithm and a tracking scheme 
for identifying and tracking blobs. We then present a real¬ 
time blob detection approach by leveraging MPI/ OpenMP 
parallelization in a large cluster in Section IV. The blob 
detection and tracking results and its real time evaluation 
are shown in Section V. We conclude the paper, and give 
our future plans in Section VI. 

2 Problem Definition and Related Work 

In this section, we introduce our problem definition and 
discuss previous work related to our study. For related 
work, we first discuss existing research work on outlier 
detection, and then review previous work on blob detection 
in fusion plasma domain. 

2.1 Problem Definition 

Extracting spatial-temporal features play an important role 
in analyzing scientific and engineering applications, includ¬ 
ing behavior recognition |20|, bioinformatics |21|, video 
analysis c (22) , and health informatics (23) . Depending on 
the applications, mining spatial features in one time frame 
and relationships among spatial objects in and across time 
frames are extremely challenging tasks due to three reasons. 
First, the extent and shape of a feature could be an important 
indicator in determining its influence. However, due to 
various data type (regular and irregular), it is not easy 
to apply a generic approach for all applications. Second, 
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Fig. 2: A contour plot of the local normalized density in the 
region of interests in one time frame in fusion experiments 
or numerical simulations. A cross-section of the torus is 
called a poloidal plane. R and Z are cylindrical coordinates 
and the major radius of the torus is denoted by R. 


effectively incorporating the temporal information in the 
overall analysis is a necessity to uncover interesting up¬ 
coming events. Finally, how to process very large data sets 
in real time demands appropriately responding to extreme 
scale computing and big data challenges. In this work, we 
attack this problem by presenting a three-step approach 
for detecting and tracking spatio-temporal features in the 
context of blob-filament detection in fusion plasma. 

The definition of a blob is varied in the literature de¬ 
pending on fusion experiments or simulations as well as 
available diagnostic information for measurements |8|. This 
makes blob detection a challenging task. Figure [ 5 ] plots local 
normalized density distribution in the regions of interest in 
one time frame. We can observe that there are two reddish 
spots located at the left portion of the figure, which are 
associated with blob-filaments and are significantly different 
from their surrounding neighbors. It is clear that a reddish 
spot is not a single point but a group of connected points or a 
region. Therefore, we formulate the blob detection problem 
as a region outlier detection problem. Similar to the spatial 
outlier ||9j, a region outlier is a group of spatial connected 
objects whose non-spatial attribute values are significantly 
different from those of other spatial surrounding objects in 
its spatial neighborhood. Figure [ 5 ] shows blobs are region 
outliers. The number of region outliers detected is deter¬ 
mined by pre-defined criteria provided by domain experts. 

The problem is to design an efficient and effective 
approach to detect and track different shapes of region 
outliers simultaneously in fusion plasma data streams. By 
identifying and monitoring these blob-filaments (region 
outliers), scientists can gain a better understanding about 
this phenomena. In addition, a data stream is an ordered 
sequence of data that arrives continuously and has to be 
processed online. Due to the high arrival rate of data, the 
blob detection must finish processing before the next data 
chunk arrives |24|. Therefore, another critical problem is 
to develop a high-performance blob detection approach in 
order to meet the real-time requirements. 


2.2 Outlier Detection 

The problem of outlier detection has been extensively 
studied and can be generally classified into four cate¬ 
gories: distance-based, density-based, clustering-based, and 
distribution-based approaches |3], (25) . 

Distance-based methods (26| use a distance metric to 
measure the distances among data points. If the number 
of data points within a certain distance from the given 
point is less than pre-defined threshold, then this point is 
determined as an outlier. This approach could be very useful 
with accurate pre-defined threshold. However, it may not 
be proper to use a simple threshold if different densities in 
various regions of the data exhibit across space or time. 

Density-based methods (27) assign a local outlier factor 
(LOF) to each sample based on their local density. The LOF 
determines the degree of outlierness, where samples with 
high LOF value are identified as outliers. This approach 
does not require any prior knowledge of underlying dis¬ 
tribution of the data. However, it has a high computational 
complexity since pair-wise distances have to be computed 
to obtain each local density value. 

Clustering-based methods (28) conduct clustering-based 
techniques on the sample points of the data to characterize 
the local data behavior. Since this method does not focus on 
outlier detection, the outliers are produced as by-products 
and it is not optimized for outlier detection. 

Distribution-based methods (9) applies machine learning 
techniques to estimate a probability distribution over the 
data and develop a statistical test to detect outliers. These 
methods use all dimensions to define a neighborhood for 
comparison and typically do not distinguish non-spatial 
attributes from spatial attributes. 

In the context of data streams, a line of research has been 
devoted to develop efficient outlier detection techniques 
(24) , (29) , (30) , (3l) , (32) , (33) . But their main focus is to solve 
the problem of event detection in sensor network |291, query 
processing (30), (32), clustering (3lj, and graph outliers (33) . 
Therefore, these methods cannot be easily generalized to 
region outlier detection problems. In addition, the problem 
of blob detection presents a special challenge, because the 
spatiotemporal attributes of the blob-filaments has to be 
considered together to study their various characteristics 
including speed, direction, movement, and size. More im¬ 
portantly, these methods are mostly single-threaded which 
cannot cope with real-time requirements in fusion plasma. 

A number of distributed outlier detection methods have 
also been studied in (29), (34) , (35) , (36) , (37) . Most of 
these methods are seeking an efficient way to parallelize 
classical outlier detection methods such as distance-based 
outliers (35) , EL distribution-based outliers (29) , density- 
based outliers FT density-based outliers (36) , and PCA- 
based techniques (34) . However, there methods are not 
generally applicable to region outlier detection and tracking. 
In particular, in order to tackle high volume and velocity 
challenges arising from fusion plasma big data, specialized 
outlier detection scheme and suitable high performance 
computing technique are demanded to complete blob de¬ 
tection in the order of milliseconds. 

In the first two steps of our proposed approach, we 
apply distribution-based outlier detection to detect outlier 
points by considering only non-spatial attributes and then 
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leverage fast CCL to construct the region outliers by taking 
into account spatial-attributes. We choose distribution-based 
outlier detection since it can solve the problem of finding 
outliers efficiently if an accurate approximation of a data 
distribution can be properly found |9|, (29]. Normally the 
distribution of the stream data may change over time |5]|. 
However, this assumption may not hold in fusion experi¬ 
ments since a fusion experiment lasts very short time period 
from a few seconds to hundreds of seconds. Therefore, 
we consider the simpler problem of fixed distribution pa¬ 
rameters, noting that several fusion devices have shown 
similar distribution functions of blob events. Then we can 
perform exploratory data analysis to compute best fitted 
distribution parameters offline and then build an accurate 
online distribution model. We leave the more complicated 
problem of real-time distribution estimation for future work. 

2.3 Blob Detection in Fusion Plasma 

Independently, fusion blob detection problems have been 
researched by the physics community in the context of co¬ 
herent structures in fusion plasma |8[. Various post-run blob 
detection methods have been proposed to identify and track 
these structures, to study the impact of the size, movement 
and dynamics of blobs. A plasma blob is most commonly 
determined by some threshold, computed statistically in the 
local plasma density signal (38), (39), (40). However, the 
exact criteria have varied from one experiment to another, 
which reflects the intrinsic variability and complexity of the 
blob structures. In (38) , a conditional averaging approach is 
applied to analyze spatio-temporal fluctuation data. When 
the vorticity is larger than one standard deviation at some 
time frame, a blob is considered to be detected by the probe. 
In 1391, the conditional averaging technique is also used to 
study the evolution of the blob-filaments using Langmuir 
probes and a fast camera. 

Without using a conditional averaging technique, (40) 
searches for blob structures can be done using local mea¬ 
surements of the 2D density data obtained from a 2D probe 
array. Identification of a blob is based on the choices of 
several constraints such as the threshold intensity level, the 
minimum distance of blob movement, and the maximum 
allowed blob movement between successive frames. The 
trajectories of the different blobs can be computed with 
the blob centers based on identification results in each time 
frame. The seminal work by Zweben, et. al. (40) was the first 
attempt to take only individual time frame data into account 
to detect blobs and track their movements, although the 
process of identification of a blob was somewhat arbitrary 
and oversimplified. 

Due to the emergence of fast cameras and beam emission 
spectroscopy in the last decade, the situations of insufficient 
diagnostic access and limited spatial and temporal resolu¬ 
tion have been greatly improved. In the context of computer 
version, a number of methods have been developed to tackle 
blob detection problem, which is aimed to detect points or 
regions in the image that either brighter or darker than the 
surrounding (4l) . Among them, scale-space methods based 
on the Laplacian of Gaussian (42) , (43) , (44) and Watershed 
detection methods based on local extrema in the intensity 
landscape (45) are two main classes of blob detectors. In 


(46) , Love and Kumath made the first attempt to apply an 
image analysis using Watershed techniques for identifying 
blobs in fusion plasma. The images are first processed to 
remove the noise spikes, followed by further smoothing 
using a Gaussian filter, and then identified by various image 
segmentation techniques. However, due to noise and lack of 
a ground truth image, this approach can be sensitive to the 
setting of parameters, and it is hard to use generic method 
for all images. In addition, the output from visualization is 
not convenient to feed into other analysis (2). The regions of 
interest computed from this work can be more conveniently 
fed into other analyses. For instance, one can compute blobs 
in the regions of interest very quickly and transmit these 
compact meta information over internet to remote domain 
scientists for real-time analysis. 

Recently, several researchers (18), [19j, (47) have devel¬ 
oped a blob-tracking algorithm that uses raw fast camera 
data directly with GPI technique. In 118), 1191, they lever¬ 
age a contouring method, database techniques and image 
analysis software to track the blob motion and changes 
in the structure of blobs. After normalizing each frame 
by an average frame created from roughly one thousand 
frames around the target time frame, the resulting images 
are contoured and the closed contours satisfying certain size 
constraints are determined as blobs. Then, an ellipse is fitted 
to the contour midway between the smallest level contours 
and the peak. All information about blobs are added into 
a SQL database for more data analysis. This method is 
close to our approach but it can not be used for real-time 
blob detection since they compute time-averaged intensity 
to normalize the local intensity. Additionally, only closed 
contours are treated as blobs, which may miss blobs at 
the edges of the regions of interest. Finally, these methods 
are still post-run-analysis, which cannot provide real-time 
feedback in fusion experiments. 

3 Our proposed approach 

Given a fusion data stream, which consists of a time or¬ 
dered sequence of sample frames that arrive continuously 
from fusion experiments or numerical simulations through 
remote direct memory access protocols. Our data sets are 
simulated electron density from the fusion simulation code 
XGC1 [111, (12 [. In the present data sets, simulation data is 
captured every 2.5 microseconds for a total time window of 
2.5 milliseconds. Each point Si E S in a time frame t has a 
spatial attribute (r, z, t ) which defines its location in a trian¬ 
gulated measurement grid, and some non-spatial attributes 
including all important plasma quantities such as electron 
density n e (r,z,t) as well as connectivity information in a 
poloidal plane. The spatial neighborhoods are defined for 
each point from the connectivity database in a triangulated 
grid. Formally, an region outlier responding to a blob is 
defined as a spatial area in the regions of interest where 
a subset Bi C S is a group of connected points s*. 

Our overall goal is to develop an approach to detect 
and track spatial region outliers (blobs) over time using 
a stream of fusion data. To achieve this, we break down 
the process into three steps: (1) find outlier points in the 
region of interests, (2) group these outlier points as different 
region outliers, and (3) track these regions outliers by the 
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Fig. 5: An example of exploratory data analysis to analyze 
the underlying distribution of the local normalized density 
over all poloidal planes and time frames. 


Fig. 3: Two-phase region outlier detection for finding blobs 



Fig. 4: An example of the regions of interest and the compar¬ 
ison between refined and original triangular mesh vertices 
in the R (radial) direction and the Z (poloidal) direction. 


overlapping in space. We address the first two steps by 
presenting a two-phase approach, as shown in Figure [3] 
In the first phase, we apply a distribution-based outlier 
detection algorithm to the fusion data stream in order to 
detect outlier points which have significantly higher non- 
spatial attributes than other points. The outputs of this step 
are tuples (s*, n e (r*, Zi, t)), the 2D spatial attributes, and 
non-spatial attributes such as electron density. These tuples, 
as well as connectivity information, are used as input for the 
second phase, where region outlier are detected by applying 
a fast CCL (17) to efficiently find different connected compo¬ 
nents on the triangular mesh. The outputs of the CCL-based 
region outlier detection algorithm are a set of connected 
components with outlier points inside, which are associated 
with blobs if some criteria are satisfied. We address the last 
step by proposing an efficient blob tracking algorithm by 
leveraging cues from changes of blobs area and distance 
of center of blobs. Note that, by varying the first step, this 
procedure could be applying to different applications. 

In the following section, we describe the proposed two- 
phase region outlier detection in detail. 


non-spatial attributes and consider the statistical distribu¬ 
tion of the non-spatial attributes to develop a test based on 
distribution properties, since it is more suitable for detecting 
spatial outliers (5). As claimed in (29), it is very efficient to 
find outliers by using a data distribution approximation if 
we estimate the underlying distribution of data accurately. 
Values for various criteria are determined by domain ex¬ 
perts or subjectively by examining the resulting plotting and 
adjusting them until satisfied. 

In the proposed outlier detection we firstly preprocess 
the sample frame to compute needed quantities in the region 
of interests, as shown in Figure [4a] Then it is analyzed 
by normalizing the total electron density n e (r,z,t) (which 
includes fluctuations) with respect to the initial background 
electron density, n e (r,z, 1) (if using real diagnostic data 
from, e.g. GPI, actual emission intensity I(r,z,t) would 
be used instead of electron density). Note that using the 
initial time frame as the benchmark is an important factor 
to achieve real-time blob detection. The normalized elec¬ 
tron density in the subsequent time frames can be easily 
computed, especially compared to the time-average electron 
density with a long time interval |19|. 

Algorithm 1 Triangular mesh refinement algorithm 

Input/output: 

triGrid : connectivity array of the triangular mesh 
(r, z)\ spatial coordinate of each point 
n e \ normalized electron density of each point 
1: Compute unique edges E and indices vector Ie by 
sorting and removing duplicates based on triGrid 
2: Compute spatial coordinate of each new vertices in the 
middle of E based on (r, z) 

3: Compute electron density of each new vertices on E by 
performing linear interpolation based on n e 
4: Compute indices for each new vertices by adding vector 
index Ie with the number of original points 
5: Compute a new triangular mesh by assigning appropri¬ 
ate indices from each new and old vertices 


3.1 Distribution-Based Outlier detection 

The main task of this phase is to perform efficient outlier 
detection to determine outlier points which form the region 
outliers associated with blobs. To facilitate this goal, we pro¬ 
pose a novel distribution-based outlier detection algorithm 
based on the electron density with various criteria for fusion 
plasma data streams. We separate spatial attributes from 


To obtain meaningful region outliers using the CCL 
method, it is necessary to have fine grained connectivity 
information. This particular simulation mesh has coarse 
vertical resolution, so resolution enhancement techniques 
are applied to generate a higher resolution triangular mesh 
based on the original triangulated mesh. As shown in Algo¬ 
rithm [lj the resulting triangular mesh is refined to achieve 
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four times better granularity. We split each original triangle 
into four smaller ones by linking three middle points of 
the original mesh edges in each triangle. The corresponding 
density of generated vertices can be obtained using linear 
interpolation of the original triangular mesh. This step can 
be applied recursively until the satisfactory resolution of the 
triangular mesh is computed. Figure [4b] shows the resulting 
triangular mesh vertices after applying the triangular mesh 
refinement algorithm once. 

In order to apply an appropriate predefined quantile in 
two-phase distribution-based outlier detection, it is advised 
to perform exploratory data analysis to exploit main charac¬ 
teristics of the data sets. Figure [5] reveals that extreme value 
distribution and log normal distribution are fitted best with 
one of our sample data sets (after comparing over sixteen 
different common distributions). After analyzing the under¬ 
lying distribution, a novel outlier detection is performed to 
determine outlier points in the regions of interest. The basic 
idea of the proposed two-step outlier detection is motivated 
from the observations that there are relatively high density 
areas (a half banded ellipse area with cyan color) in the edge 
and several significantly high density small regions (a few 
small areas with reddish yellow color) in these relatively 
high density areas, as shown in Figure [2] The proposed 
outlier detection method extends the previous approach 
that applies statistical detection with conditional averaging 
intensity value (38j, (39][, and applies more intelligent outlier 
detection with only considering individual time frame data. 
Compared to traditional single threshold segmentation ap¬ 
proach, our approach is more generic, flexible and easier to 
tune a satisfactory result. 

In the first step, the standard deviation a and the ex¬ 
pected value /i are computed over all sixteen poloidal planes 
in one time frame. Using the best fitted distribution, we 
apply first step outlier detection to identify the relative high 
density areas with a specified predefined quantile: 

N(ri,Zi,t) ~(i> a*a,V(ri,Zi) eT (1) 

where N is the normalized electron density, a is the multiple 
of a associated to the specified predefined quantile and T 
is the domain in the region of interests. Once the relative 
high density regions are determined, we compute another 
standard deviation and the expected value /li 2 in these 
areas. Then we employ second step outlier detection to 
identify the outlier points in the relative high density areas 
with an appropriately chosen predefined quantile: 

N(ri,Zi,t ) 1^2 > /?*<x 2 ,V(ri,Zi) £ T 2 (2) 

where /3 is the multiple of cr 2 associated to the judiciously 
chosen confidence level and T 2 is the domain of blob can¬ 
didates. In practice, a and f3 can be chosen to be same or 
different, depending on the characteristics of blob-filaments. 
In our experience, the a value is generally greater than (3 
since the standard deviation a over the region of interests 
is much smaller than the standard deviation from the 
relative high density areas. 

However, two-step outlier detection alone cannot be 
used to distinguish the blob candidates since identified blob 
candidates may actually have small density, which does not 
satisfy traditional definition of blobs. Therefore, the density 
of the mesh points in the outlier points smaller than a 


certain minimum absolute value criterion need to be filtered 
out. On the other hand, it is also possible that the middle 
areas between surrounding plasmas and outlier points have 
density higher than the given minimum absolute value 
criterion. Thus, we also apply a minimum relative value 
criterion to remove these unwanted points. To combine 
these two rules together, we have a more robust and flexible 
criterion: 

N(n,Zi,t) > max(d ma , (d mr * /j 2 )),V(ri, Zi) £ r 3 (3) 

where d ma and d mr are minimum absolute value and min¬ 
imum relative value respectively, and Ts is the domain of 
good blob candidates. 

3.2 CCL-Based Region Outlier Detection 

The main task of the second phase is to apply an efficient 
connected component labeling algorithm adopted from (T7| 
on a refined triangular mesh to find different blob candi¬ 
date components. A connected component labeling algo¬ 
rithm generally considers the problem of labeling binary 2D 
images with either 4-connectedness or 8-connectedness. It 
performs an efficient scanning technique, and fills the label 
array labels so that the neighboring object pixels have the 
same label. Wu |[l7) presents an efficient two-pass label¬ 
ing algorithm that is much faster than other state-of-the- 
art methods and theoretically optimal. However, since we 
process a refined triangular mesh rather than the traditional 
2D images, we have modified their CCL algorithm to take 
the special features of a triangular mesh into account. As 
shown in Algorithm [2j each triangle is scanned first rather 
than a point. Since we know the three vertices in a triangle 
are connected, we can reduce unnecessary memory accesses 
once any vertex in a triangle is found to be connected with 
another vertex in a different triangle. Then we compute the 
current minimum parent label in this triangle, and assign 
each vertex a parent label if its label has already filled or a 
label if its label has not initialized yet. If all three vertices 
in a triangle are scanned for the first time, then a new 
label number is issued and assigned to their labels and the 
associated parent label. After the label array is filled full, 
we need flatten the union and find tree. Finally, a second 
pass is performed to correct labels in the label array, and all 
blob candidates components are found. Note that to perform 
efficient union-find operations, the union-find data structure 
is implemented with a single array as suggested in p7) . 

After all blob candidates are determined, a blob is 
claimed to be found if the median of a blob candidate 
component satisfies a certain minimum absolute median 
value criterion. The reason we are setting this constraint to 
select the blobs is that the minimum value criterion has to 
be a reasonably small value in order to produce more blob 
candidate components. It is possible that if the minimum 
absolute median value criterion is too large, it may also 
remove the blobs. On the other hand, it is also possible if this 
value is too small, it does not have effect on filtering out un¬ 
wanted components. Therefore, with the same philosophy 
of measurement, a minimum relative median value criterion 
is also applied to determine the blobs. However, in order to 
protect the blobs from being removed due to the extremely 
large mean value /i 2 / we also set the maximum absolute 
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Algorithm 2 Connected component labeling algorithm on 
triangular mesh to find various blob candidates components 

Input: 

triGrid : connectivity array of the triangular mesh 

Output: 

B c : Region structure of each blob candidate 
1: Initialize label, parentLabel, and labnum 
2: for Scanning each triangle until the end of triGrid do 
3: if label of three vertices are all zero then 

4: Assign a new labnum to all three vertices 

5: Update label and parentLabel with labnum 

6: else 

7: Find the minimum parentLabel of all three vertices 

8: Update their label and parentLabel with this value 

9: end if 

10: end for 

11: for Scanning until the end of parentLabel do 

12: Update parentLabel by flattening union-find tree 

13: end for 

14: for Scanning until the end of Label do 
15: Update Label with latest parentLabel 

16: end for 

17: Find each B c of points with same parentLabel 


median value criterion to limit the power of minimum 
relative median value criterion. We unify these three rules 
to be one: 

N(n, Zi, t) > ma x(d ma , min({d mr * /u 2 ), d xa )), 

v(r i ,2 i )e r 4 (4) 

where d ma , d mr and d xa are minimum absolute and rela¬ 
tive median values and maximum absolute median value 
respectively and T 4 is the domain of blobs. 

3.3 Tracking Region Outliers 

The objective of the third step is to track the direction and 
speed of the detected blobs over time. The blob tracking 
algorithm has to cope with the problem of tracking multiple 
region outliers simultaneously even when the blobs merge 
together or split into separated ones. On the other hand, the 
blob tracking method should be simple and efficient to meet 
real-time requirements. To achieve this goal, we propose an 
efficient blob tracking algorithm by leveraging cues from 
changes of blobs area and distance of center of blobs. We 
compute the correspondence between previously tracked 
blobs and currently detected blobs, and then recover the 
trajectories of the tracked blobs. 

To identify the location center of detected blob, we com¬ 
pute the density-weighted average of the spatial coordinates 
of each point inside a blob. 

1 n 

(r c ,z c ) =—Y,( r ’ z ) n e ( * * 3 4 5 ) 

i= 1 

where M is summation of n e of all points in a blob. The 
density-weighted average is used to better capture the cen¬ 
ter of density of a blob. We track the movement of these 
detected blobs by linking the centers in consecutive time 
frames. In order to obtain the boundary of region outliers 


(blobs), we compute the convex hull |48| of a set of points 
in a blob. The area of a blob is computed by counting the 
number of points in a blob. 

Algorithm 3 Efficient blob tracking algorithm 

Input: 

B: Current detected blobs 
T : Previous blob tracks 

Output: 

T : Updated blob tracks with B appended 
1: Initialize hull, cen, and area 
2: hull = getBoundary(T?) 

3: cen = getCenter(T?) 

4: area = getArea(F>) 

5: for Scanning until the end of B do 
6: cenDis = getCenterDis(F>,T) 

7: areaDif = getAreaDif(T?,T) 

8: if cenDis < maxjump A areaDif < maxDif then 

9: Find a blob track T with smallest cenDis 

10: Append current blob into this blob track T 

11: end if 

12: end for 

13: Update T with hull, cen, area, and computed speed 


As shown in Algorithm 3j the input parameters are 
current detected blobs and the previous blob tracks. The 
data structure of a blob track is composed of the track ID, 
the length of track, the area of previous blob, the time- 
stamps, the center points, the boundary points, and the 
velocity. There are two heuristics to verify whether a blob 
is associated with an existing blob track. The first heuristic 
is based on the fact that the area of a blob between consec¬ 

utive time frames cannot decrease or increase significantly. 
The second heuristic takes into account the distance of the 
centers of a blob does not change dramatically over very 
short time period (microseconds). The proper thresholds 
for these two heuristics are provided by domain experts. 
Since blobs can appear, disappear, merge together or split, 
a greedy scheme is applied to find the best matching pair 
of blob and track based on closest distance of the centers 
of current detected blob and the latest blob in a blob track. 
Based on computed correspondence between a blob track 
and the currently detected blobs, existing blob tracks are 
automatically processed through corresponding operations 
such as adding a blob into a track, creating a new track, 
and a track ending. If the length of a track is smaller than 

3 consecutive time frames, the track will be treated an 
anomaly and deleted due to errors in data or inappropriate 
blob detection thresholds. The speed and direction of the 
blobs can thus be computed from two consecutive center 
points. Finally, we can recover the trajectories of the tracked 
blobs by monitoring the movement of blob centers. 

4 A REAL-TIME BLOB DETECTION APPROACH 

Existing blob detection approaches cannot tackle the two 
challenges of the large amount of data produced in a shot 
and the real-time requirement. In addition, existing data 
analysis approaches are often operated in a single thread, 
only for post-run analysis and often take a few hours to gen¬ 
erate the results (49) . In order to meet the real-time feedback 
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Fig. 6: Hybrid MPI/OpenMP parallelization 


requirement, we address these challenges by developing 
a high performance blob detection approach, which can 
leverage in situ raw data and find blob-filaments efficiently 
in fusion experiments or numerical simulations. 


4.1 A hybrid MPI/OpenMP parallelization 

The key idea is to exploit many cores in a large cluster 
system by running MPI to allocate n processes to process 
the data in one or several time frames at the high level, 
and by leveraging OpenMP to accelerate the computations 
using m threads at the low level. Figure 6] shows our hybrid 
MPI/OpenMP parallelization for blob detection. Using this 
approach, we can complete our blob detection in a few 
milliseconds with in situ evaluation. 

In order to achieve blob detection in real time, the goal 
is to minimize data movements in the memory and speed 
up computation. Ideally, the performance is optimal without 
any communication if we can perform the job correctly. The 
proposed blob detection algorithm in the previous section 
supports embarrassingly parallel since we only need the 
initial time frame and the target time frame to do the com¬ 
putation. This is an important difference between our blob 
detection method and recently developed methods |18|, 1191 
in terms of real-time requirement. Furthermore, we explore 
many-core processor architectures to speed up the compu¬ 
tation of each MPI task by taking full advantage of multi¬ 
threading in the shared memory. Therefore, our real-time 
blob detection approach based on hybrid MPI/OpenMP 
parallelization is a natural choice and is expected to provide 
the optimal performance for fusion plasma data streams. 

A practical interesting issue is how to tune the number 
of MPI processes and OpenMP threads for the best perfor¬ 
mance by taking both analysis speed and memory size into 
account. As shown in Figure [7j we vary the number of MPI 
processes and OpenMP threads but fix the total number to 
be 24 for investigating the performance when processing the 
same amount of time frames data. A faster analysis speed 
is achieved when increasing the number of MPI processes 
since more data frames can be processed simultaneously. On 
the other hand, the analysis speed remains constant with 
a few OpenMP threads and degrades with more OpenMP 
threads due to lack of enough computation in one time 
frame. However, more OpenMP threads could significantly 
reduce the memory demands. Therefore, in this study, we 
choose the number of OpenMP threads to be four for each 
MPI task, to achieve a good trade off between analysis speed 
and memory savings. 


Total number 24 of MPI x OpenMP 



Fig. 7: Investigate the performance of hybrid MPI/OpenMP 
parallelization when varying number of MPI processes and 
OpenMP threads. The blue triangle denotes only normal¬ 
ized blob detection time. The red star denotes the normal¬ 
ized total time including both blob detection time and initial 
communication time for broadcasting the first time frame to 
all analysis nodes for normalization. 


4.2 Outline of the implementation 

We implement our blob detection algorithm in C with a 
hybrid MPI/OpenMP parallelization. Algorithm [3] summa¬ 
rizes the proposed blob detection algorithm without con¬ 
sidering OpenMP. Users can specify the regions of interest 
by (Rmin, Rmax, Zmin, Zmax), the range of target time 
frames by (t_start, t_end), and the location of the data sets. 
However, with in situ evaluation, there is no need to specify 
the file location since all data are already in memory. We 
use static scheduling to evenly divide the number of time 
frames for each MPI task for efficiency. The n MPI processes 
are allocated to process one or several time frames and m 
OpenMP threads are launched to accelerate the computation 
in one time frame. Note that the MPI process is also the 
master thread in the runtime environment. At the begin¬ 
ning, the initial time frame data is broadcasted to all MPI 
processes so that normalization can be performed with new 
coming time frames. Then each MPI process embarrassingly 
process the data in each time frame with multithreading 
in the shared memory. Only detected blobs information are 
maintained and added into local database. Since these local 
blobs information are very compact, they can be efficiently 
transmitted over internet to remote servers for real-time 
analysis by domain scientists. 

5 Experiments and Results 

In this section we present experimental evaluations of our 
blob detection and tracking algorithms, and report the per¬ 
formance of the real-time blob detection under both strong 
and weak scaling. Before showing experimental results in 
the next section, we briefly introduce our experimental 
environment, data sets, and parameter setting in our blob 
detection and tracking algorithms. We have tested our im¬ 
plementation on the NERSC's newest supercomputer Edi¬ 
son, where each compute node has two Intel "Ivy Bridge" 
processors (2.4GHz with 12 cores) and 64 GB of memory. 
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Algorithm 4 A real-time blob detection approach 

Input: 

Rmin, Rmax, Zmin, Zmax: specify region of interest 
t_start, t_end : start and end time frames 
FileDir: location where data sets locate 

Output: 

B : Detected region outliers (blobs) 

1: Apply static scheduling to assign equal amount of n 
time frames data to each MPI process 
2: Broadcast the initial time frame to all MPI processes 
3: for t = 1 : n do 

4: Process i loads raw data in one frame and computes 

normalized density n e (r, z, t) in region of interest 
5: Refine the triangular mesh. See Algorithm [l] 

6: Apply two-phase distribution-based outlier detection 

to identify outliers with various criteria 
7: Apply CCL-based region outlier detection on a trian¬ 

gular mesh to find blob components. See Algorithm [2] 

8: A blob is added into B if certain criteria is satisfied 

9: end for 


Our base data sets are simulation data sets with 1024 time 
frames based on the XGC1 simulation (Tl) (12) from the 
Princeton Plasma Physics Laboratory, which last around 
2.5 milliseconds. One of our main goals is that we can 
control analysis speed by varying the number of processes 
to complete the blob detection on the entire data set in a 
time close to 2.5 milliseconds. It would indicate that our 
algorithm could monitor fusion experiments in real time 
(neglecting data transfer latency). If we consider internet 
transfer latency in real experiments or numerical simulation, 
the system needs at least 1 to 25 milliseconds to transfer one 
time frame data depending on size of data, which may give 
us more time for data analysis. 

Another goal is to validate the effectiveness of the pro¬ 
posed algorithms. In Algorithm |4j we apply various criteria 
to identify the blobs. The parameters for blob detection 
and tracking in our experiments are given in Table [l] One 
criterion we have not mentioned in the previous section is 
parameter "minArea". This parameter is used to decide how 
many points a blob should have, which is used to remove 
impossibly small blobs. In our experiment, this parameter is 
set to three since there are at least three vertices connected 
as a 2D component in a triangular mesh. Another criteria 
are parameters "maxFrames" and "minFrames", which are 
used to control the length of a blob track and remove noisy 
tracks. It is important to note that these parameters need 
to be tuned in order to achieve optimal performance in 
different fusion experiments or numerical simulations. The 
reasons for this uncertainty in the context of blob detection 
are from the intrinsic variability and complexity of the blob 
structures observed in different experiments (8). 

5.1 Performance comparison 

We first conduct experiments to compare our method with 
recently developed state-of-the-art blob detection methods 
in 1181, 1191. Since their methods are based on the con¬ 
touring methods and thresholding, we call their methods 
the contouring-based methods. We have to point out that 


TABLE 1: Parameters setting for the proposed blob detection 
and tracking algorithms on XGC1 simulation data sets 


detection criteria 

tracking criteria 

minArea 

3 

maxAreaChange 

25 

minRden (d ma ) 

1.2 

max Jump 

0.04 

minAbsden (<i mr ) 

2.05 

maxFrames 

100 

maxAbsMden ( d xa ) 

2.75 

minFrames 

3 

minMden (d ma ) 

1.3 



minAbsMden (d mr ) 

2.15 




strictly quantitative comparisons are not possible since the 
blob itself is not well-defined (8). Due to this reason, there 
are rarely direct comparisons between any new proposed 
method and existing ones in the literature in the domain 
of fusion plasma |l8j, |1J9), (38) , (39) , [40), (46), (47), ||49|. 
However, in order to demonstrate that our methods is more 
robust than the contouring-based methods, we compare 
these two methods in two typical cases to shed light on their 
performance in terms of the detection accuracy. 

Figure [8] shows the comparison of the blob detection 
results between our region outlier detection method and the 
contouring-based methods in two different time frames. As 
shown in Figures [8a 


8b we can see that our region 


and 


outlier detection method does not miss detecting the blob 
at the edge of the regions of interest while the contouring- 
based methods fail the detection. The reason is that the 
contouring-based methods require the computed contours 
are closed, which do not exist at the edge of the regions of in¬ 
terest. In Figures [8c] and [8dJ we notice that our region outlier 
detection method can accurately detect all blobs. However, 
the contouring-based methods either yield the blobs with 
incorrect areas (much larger or smaller), or misdetect the 
wrong area as a blob. This is because that it is hard to 
use one single threshold to identify the blobs for various 
time frames even in the same experimental data. Our region 
outlier detection method does not have such problem since 
we use more flexible distribution-based outlier detection. 


5.2 More blob detection results 

We perform more experiments to comprehensively examine 
the blob detection results in five continuous time frames and 
four different poloidal planes as shown in Figure [9] As we 
can see from the figure, our region outlier detection method 
can provide consistently good results in different situations. 
In addition, our method does not miss any blobs at the edge 
of the regions of interest, as shown in subfigures [9b] [9gJ [9c] 
and[9h| It is interesting to see that large-scale blob structures 
are often generated, which could cause substantial plasma 
transport (40) . As pointed out in (50) , these large-scale 
structures are mainly contributed by the low-frequency and 
long-wavelength fluctuating components, which may be 
responsible for the observations of long-range correlations. 
We also noticed that different poloidal planes may display 
significant diversity in edge turbulence, even in the same 
time frame. We have shown that we are able to effectively 
detect the blobs and reveal some interesting results to help 
physicists improve their understanding of the characteristic 
of blobs and their correlation with other plasma properties. 
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Blob Detection: time frame 45 and Poloidal plane 1 



Blob Detection: time frame 45 and Poloidal plane 1 


(a) Contouring-based methods 

Blob Detection: time frame 87 and Poloidal plane 1 




2.25 2.26 


2.27 2.28 2.29 

R value 


(b) Region outlier detection method 

Blob Detection: time frame 87 and Poloidal plane 1 



2.27 2.28 2.29 

R value 


(c) Contouring-based methods 


(d) Region outlier detection method 


Fig. 8: Two examples of comparing our region outlier detection method with the Contouring-based methods in the R 
(radial) direction and the Z (poloidal) direction. The separatrix position is shown by a white line and the different pink and 
blue circles denote blobs. 


5.3 Blob tracking results 


We investigate the blob tracking results in two different 
situations. Figure 10a exhibits a 2D trajectory of a blob. 
Again, the trajectory is generated by plotting the location of 
the density peak of the detected blobs over five consecutive 
time frames. We can see that our blob tracking algorithm 
can track two separate blobs simultaneously. The blob size 
can grow when they move towards confined plasma in the 
right region of separatrix. Figure |10b| shows a 3D trajectory 
for a detected blob over fifteen consecutive time frames. 
In this case, the blob seems to maintain its size for a few 
time frames, then gradually decreases, and eventually dis¬ 
appears. Through these interesting results, physicists may 
be able to understand the characteristics of blobs better. 


5.4 Real-time blob detection under strong scaling 


architecture as follows: 

. runtime of Blob detection using single core 

speedup = - 

runtime using F cores 

Our most encouraging results are that we can complete 
blob detection on the simulation data set described above in 
around 2 milliseconds with MPI/OpenMP using 4096 cores 
and in 3 milliseconds with MPI using 1024 cores. In Figure 
|TT) we can achieve linear time speedup in blob detection 
time under strong scaling. The MPI and the MPI/OpenMP 
implementations accomplish 800 and 1200 times speedup 
respectively, when the number of processes is scaled to 1024. 
Also, we can see that the hybrid MPI/OpenMP implementa¬ 
tion is about two times faster than the MPI implementation 
when varying the number of processes from 1 to 512. With 
1024 processes, both of them achieve similar performance, 
but the MPI/OpenMP one is slightly better. This demon¬ 
strates that we are able to control analysis speed by varying 
the number of processes to achieve real-time analysis. 


We have illustrated the effectiveness and robustness of 
the proposed blob detection and tracking methods. Next, 
we perform a set of experiments to demonstrate the per¬ 
formance of our real-time blob detection approach under 
strong scaling and weak scaling. We define the speedup 
of our parallel implementation on heterogeneous multi-core 


5.5 Real-time blob detection under weak scaling 

In this experiment, we evaluate the performance of our 
real-time blob detection under weak scaling. We replicate 
existing data sets (30GB) in order to obtain adequate ex¬ 
perimental data sets (4.3TB). The basic unit data contains 
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(a) Frame 82 and plane 1 (b) Frame 83 and plane 1 (c) Frame 84 and plane 1 (d) Frame 85 and plane 1 (e) Frame 86 and plane 1 


(f) Frame 82 and plane 2 (g) Frame 83 and plane 2 (h) Frame 84 and plane 2 


(i) Frame 85 and plane 2 (j) Frame 86 and plane 2 



(p) Frame 82 and plane 4 (q) Frame 83 and plane 4 (r) Frame 84 and plane 4 (s) Frame 85 and plane 4 (t) Frame 86 and plane 4 


Fig. 9: An example of the blob detection in five continuous time frames and four different poloidal planes in the R (radial) 
direction and the Z (poloidal) direction. The separatrix position is shown by a white line and the different blue circles 
denote blobs. 



(a) 2D trajectory for detected blobs 


(b) 3D trajectory for detected blobs 


Fig. 10: 2D and 3D center trajectories for detected blobs over consecutive time frames. The red solid polygon indicates the 
starting times of the blobs tracked while the blue broken polygons indicate subsequent times of the same blobs tracked. The 
centers of the moving blobs are linked to show their trajectories of the blob motion. The pink line represents the separatrix. 


128 time frames and the size of data increases linearly with indicates that our implementations scale very well to solve 
the number of processes. In Figure [l2j the blob detection much larger problems. Also, both MPI and MPI/OpenMP 
time remains almost constant under weak scaling, which implementations achieve high parallel efficiency as the 
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Fig. 11: Blob detection time and speedup with MPI and MPI/OpenMP varying number of processes under strong scaling 


Blob Detection Runtime: Weak Scaling 


Blob Detection Efficiency: Weak Scaling 



Fig. 12: Blob detection time and speedup with MPI and MPI/OpenMP varying number of processes under weak scaling 


number of processes increases from 1 to 1024. 

6 Conclusion and future work 

Near real-time extraction of spatio-temporal features in very 
large-scale irregular data presents both opportunities and 
challenges responding to extreme scale computing and big 
data in many applications. In this paper, we propose, for the 
first time, a real-time blob detection and tracking approach 
for finding blob-filaments in fusion experiments or numer¬ 
ical simulations. The key idea of the proposed approach is 
to break down the overall process into three steps. The first 
two steps are based on a distribution-based outlier detection 
scheme with various criteria and a fast CCL method to find 
blob components. In the third step, an efficient blob tracking 
scheme is presented to recover the trajectories of the mo¬ 
tions of blobs. Our hybrid MPI/OpenMP implementations 
demonstrate the effectiveness and efficiency of the proposed 
approach with a set of fusion plasma simulation data. Our 
tests show that we can achieve linear time speedup and 
complete blob detection in two or three milliseconds using 
a cluster at NERSC. 


We are currently working on integrating our blob detec¬ 
tion algorithm into the ICEE system for consuming fusion 
plasma data streams where the blob detection function is 
used in a central data analysis component and the resulting 
detection results are monitored and controlled from portable 
devices, such as an iPad. We plan to test the proposed 
method in both simulations and real fusion experiments. 
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